Don't require a full cython import to find PTX file #15785

brandon-b-miller · 2024-05-20T13:39:08Z

We shouldn't need a full import of the cuDF Cython modules just to compute a relative path. When something goes wrong, it produces awkward import errors that misrepresent what cuDF is actually trying to do. Here I've deliberately removed the Arrow library to cause an error.

>>> import cudf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/cudf/__init__.py", line 9, in <module>
    _setup_numba()
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/cudf/utils/_numba.py", line 124, in _setup_numba
    _get_cc_60_ptx_file()
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/cudf/utils/_numba.py", line 16, in _get_cc_60_ptx_file
    from cudf._lib import strings_udf
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/cudf/_lib/__init__.py", line 4, in <module>
    from . import (
ImportError: libarrow.so.1600: cannot open shared object file: No such file or directory

It also muddies the purpose and effects of _setup_numba, which should aspire to configure numba independently of cuDF, and limits our ability to customize when and how we import the Cython modules in cudf's main __init__.

With this PR, the error will now be:

>>> import cudf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/cudf/__init__.py", line 19, in <module>
    from cudf import api, core, datasets, testing
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/cudf/api/__init__.py", line 3, in <module>
    from cudf.api import extensions, types
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/cudf/api/types.py", line 20, in <module>
    from cudf.core.dtypes import (  # noqa: F401
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/cudf/core/dtypes.py", line 13, in <module>
    import pyarrow as pa
  File "/raid/brmiller/mambaforge/envs/cudf_dev/lib/python3.11/site-packages/pyarrow/__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
ImportError: libarrow.so.1600: cannot open shared object file: No such file or directory

This feels more natural and is easier to diagnose as a problem with Arrow.

vyasr

Unfortunately this won't work for editable installs. The py files and the compiled extension modules are not side-by-side for the mode of editable installation we are using. The right way to make this work is to use importlib.resources to get all of the file paths without loading the module, but that will require some work upstream in scikit-build-core to support that.
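A minimal sketch of locating a file inside a package without executing the package's __init__.py. It uses importlib.util.find_spec rather than importlib.resources, since find_spec on a top-level name only consults the import finders. The helper name find_package_file is hypothetical, and whether the search locations include the CMake-installed directory in an editable install depends on the build backend and is not verified here.

```python
import importlib.util
from pathlib import Path


def find_package_file(package, relative):
    """Return the path of `relative` inside top-level `package`, or None.

    Hypothetical helper: find_spec() on a top-level name asks the import
    finders where the package lives but does not run its __init__.py.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or spec.submodule_search_locations is None:
        # Not installed, or a plain module rather than a package.
        return None
    for root in spec.submodule_search_locations:
        candidate = Path(root) / relative
        if candidate.is_file():
            return candidate
    return None
```

For cuDF this might be called as find_package_file("cudf", "core/udf/shim_60.ptx"), assuming the file layout shown later in this thread.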

brandon-b-miller · 2024-05-20T23:13:09Z

> Unfortunately this won't work for editable installs. The py files and the compiled extension modules are not side-by-side for the mode of editable installation we are using. The right way to make this work is to use importlib.resources to get all of the file paths without loading the module, but that will require some work upstream in scikit-build-core to support that.

How are you currently performing an editable install? The PTX file locations aren't relative to any extension module, they should just be relative to the py files themselves.

vyasr · 2024-05-20T23:44:59Z

I'm doing an editable installation with pip install -e.... Try it out and see where files end up. I'm happy to show it to you on my machine if you'd like or if you have trouble reproducing the behavior I'm mentioning.

The problem is that the PTX files are generated at build time and installed via CMake. When scikit-build-core does an editable install, anything that gets installed by CMake is placed into a directory in site-packages, while any pure Python files are read directly from the source directory (as would be expected with an editable install). Observe:

(rapids) coder _ ~/cudf $ pwd
/home/coder/cudf
(rapids) coder _ ~/cudf $ find . -name "*.ptx"
./cpp/build/release/CMakeFiles/3.28.3/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx
./cpp/build/conda/cuda-12.2/release/CMakeFiles/3.29.2/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx
./cpp/build/conda/cuda-12.2/release/CMakeFiles/3.29.3/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx
./python/cudf/build/cp310-cp310-linux_x86_64/CMakeFiles/3.29.2/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx
./python/cudf/build/cp310-cp310-linux_x86_64/CMakeFiles/3.29.3/CompilerIdCUDA/tmp/CMakeCUDACompilerId.ptx
./python/cudf/build/cp310-cp310-linux_x86_64/udf_cpp/CMakeFiles/shim_60.dir/shim.ptx
./python/cudf/build/cp310-cp310-linux_x86_64/udf_cpp/CMakeFiles/shim_80.dir/shim.ptx
./python/cudf/build/cp310-cp310-linux_x86_64/udf_cpp/CMakeFiles/shim_75.dir/shim.ptx
./python/cudf/build/cp310-cp310-linux_x86_64/udf_cpp/CMakeFiles/shim_70.dir/shim.ptx
./python/cudf/build/cp310-cp310-linux_x86_64/udf_cpp/CMakeFiles/shim_90.dir/shim.ptx
./python/cudf/build/cp310-cp310-linux_x86_64/udf_cpp/CMakeFiles/shim_86.dir/shim.ptx
(rapids) coder _ ~/cudf $ find $(python -c "import site; print(site.getsitepackages()[0])")/cudf -name "*.ptx"
/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/core/udf/shim_80.ptx
/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/core/udf/shim_86.ptx
/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/core/udf/shim_60.ptx
/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/core/udf/shim_70.ptx
/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/core/udf/shim_75.ptx
/home/coder/.conda/envs/rapids/lib/python3.10/site-packages/cudf/core/udf/shim_90.ptx

brandon-b-miller · 2024-05-21T00:32:19Z

I see - if the cython extension modules are the only thing we can guarantee the location of in an editable install, I'd like to propose we ship some kind of empty cython module along with _lib (_path perhaps?) for this purpose - one that does not import the full _lib namespace - what do you think @vyasr ? If that sounds like a reasonable path forward, I can adapt this PR for that goal, else, I'll create an issue where we can figure things out in more detail and come back to this later.

vyasr · 2024-05-21T17:56:28Z

I guess it depends, what exactly is the problem that you're trying to solve? Is there a way to delay the set up of the PTX files until after you know cudf has been imported? That way you wouldn't have any need to import cudf prematurely. We could add a stub library, but I feel like that's going to be more trouble than it's worth. If we did move forward with that, you'd want to put a pyx file in core/udf with a function get_ptx_files or something so that it can always be colocated with the PTX files.

I've asked in scikit-build about how best to do this since I do think that's where work needs to be done to properly enable support.

brandon-b-miller · 2024-05-21T18:09:37Z

We're only reading from a PTX file because the version of CUDA used to compile it is needed to determine whether MVC (minor version compatibility) is necessary. That is the case when our package is compiled with CUDA V.X and the user has driver version V.Y, where X > Y. Unfortunately I think that means we can't import cudf first: later in the cuDF import process we import numba.cuda, after which we can't patch the linker. So this process must happen before the rest of cuDF initializes. I'm of the opinion that we should aspire to fix this, mostly because in the current state it might be confusing to users which piece of the initialization process is going wrong.
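The check described above can be sketched as follows. This is not cuDF's actual implementation: it only reads the `.version` directive that PTX files carry in their header and compares two version tuples. The function names are hypothetical, and the mapping from a PTX ISA version to a CUDA toolkit version is deliberately left out, so both arguments to needs_mvc are assumed to be in the same units.

```python
import re

# PTX files begin with a header containing a line such as ".version 8.2".
_PTX_VERSION = re.compile(r"^\s*\.version\s+(\d+)\.(\d+)", re.MULTILINE)


def ptx_isa_version(ptx_text):
    """Return the (major, minor) PTX ISA version declared in `ptx_text`."""
    match = _PTX_VERSION.search(ptx_text)
    if match is None:
        raise ValueError("no .version directive found in PTX text")
    return int(match.group(1)), int(match.group(2))


def needs_mvc(built_with, driver_supports):
    """True when the build is newer than what the driver supports (X > Y)."""
    return built_with > driver_supports
```

Because only the text of the PTX file is read, nothing here requires importing cudf or numba.cuda first.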

vyasr · 2024-05-21T23:17:51Z

While we're waiting to hear back re: scikit-build, maybe a try-except could do the trick as a patch too? I'm not sure that I see a way for the empty lib approach to work because just doing import cudf._lib.empty_lib will trigger a full import of everything in cudf AFAIK. We import too many things in __init__.py files.
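One possible reading of the try-except suggestion, as a sketch only: wrap the step that locates the PTX file so that an ImportError is re-raised with context about which step failed. The wrapper name and the message wording are hypothetical, and the thread does not say this is the form the patch would take.

```python
def locate_with_context(locate, what):
    """Call `locate()` and add context if it fails with ImportError.

    `locate` is any zero-argument callable; `what` describes the step.
    """
    try:
        return locate()
    except ImportError as exc:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            f"Failed while locating {what}, before the rest of the "
            f"package was imported: {exc}"
        ) from exc
```

The original exception remains available through the chained traceback, so a missing shared library would still be reported alongside the added context.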

vyasr · 2024-07-19T23:42:35Z

I did some more investigating on the skbuild side and followed up with scikit-build/scikit-build-core#807 and scikit-build/scikit-build-core#808.

vyasr · 2024-09-26T22:36:49Z

I'm going to close this since we should really be aiming to fix this upstream in scikit-build-core.

brandon-b-miller added 3 commits May 20, 2024 06:05

compute path relative to current module

e04766f

small updates

2f8bb9d

small updates

b0b3fab

brandon-b-miller added the bug, Python, and non-breaking labels May 20, 2024

brandon-b-miller self-assigned this May 20, 2024

brandon-b-miller requested a review from a team as a code owner May 20, 2024 13:39

brandon-b-miller requested review from mroeschke and galipremsagar May 20, 2024 13:39

vyasr requested changes May 20, 2024


vyasr closed this Sep 26, 2024
